AMENDMENTS TO THE CLAIMS 

The following listing of claims will replace all prior versions and listings of claims 
in the application. 

Listing Of Claims 

1. (Currently Amended) A system for tuning the text-to-speech conversion 
process, the system comprising: 

a text-to-speech engine, said text-to-speech engine receiving at least one text- 
input and converting said text-input into a processed representation, 

said processed representation including at least one speech feature associated 
with at least one segment of said representation; and 

a visual editing interface, said visual editing interface displaying said processed 
representation using at least one graphical indicator on an output device, wherein said 
segment is displayed on said output device using said graphical indicator corresponding 
to said speech feature, thereby communicating variations in one or more types of 
speech features associated with segments of said representation by varying visual 
display properties of the segments. 

2. (Original) The system of Claim 1 wherein said visual editing interface 
provides at least one editing function to a user, the editing function enabling [[the]] 
modification of said speech feature associated with said segment through a change in 
the corresponding said graphical indicator. 

3. (Original) The system of Claim 2 wherein said visual editing interface 
associates said speech feature corresponding to said segment with said graphical 
indicator, wherein the user's modification of said graphical indicator results in a 
corresponding change in said speech feature of said segment. 

4. (Original) The system of Claim 1 wherein said speech feature is at least 
one of the following: normalized text, part-of-speech, parsing of text, chunking of text, 
boundary strength, pause duration, transcription, speech rate, syllable duration, 
segment duration, pitch, word prominence, emphasis, formant mixing mode, unit 
selection override, intensity contour, formant trajectories, and allophone rules. 

5. (Original) The system of Claim 1 wherein said graphical indicator 
comprises at least one of the following: graphical style, font faces, coloring, vertical 
spacing, horizontal spacing, italicization, boldness, underlining, blinking, crossing-out, 
text orientation, text rotation, punctuation symbols and graphical symbols. 

6. (Original) The system of Claim 1 wherein said processed representation 
employs a parameterized aligned sound records format. 

7. (Original) The system of Claim 1 wherein said segment comprises at least 
one of the following: word, letter, syllable, pause, word boundary and punctuation-mark. 

8. (Original) The system of Claim 1 wherein said visual editing interface 
operates as a plug-in for a graphical user interface. 

9. (Original) The system of Claim 8 wherein said plug-in is an ActiveX 
control. 

10. (Currently Amended) The system of Claim 1 wherein said visual editing 
interface allows [[editing]] definition of said input-text by providing a set of text messages 
containing non-editable text and editable blank slots into which at least part of said 
input-text can be entered [[wherein said input-text contains at least one non-editable said 
text segment and at least one editable said segment]]. 

11. (Original) The system of Claim 1 wherein said visual editing interface is 
language independent. 

12. (Original) The system of Claim 1 wherein said visual editing interface 
provides the user with speech audio output of said processed representation. 

13. (Original) The system of Claim 1 wherein visual editing interface is 
connected to a data-store for storing and retrieving said representation. 



Serial No. 10/776,892 




14. (Currently Amended) The system of Claim 1 wherein [[the]] said 
processed representation is a modified textual representation of the processed 
input-text. 

15. (Currently Amended) The system of Claim 14 wherein the input text [[said 
textual representation]] is used to generate said processed representation. 

16. (Currently Amended) The system of Claim 15 wherein said modified 
textual representation is stored and accessed from a data store. 

17. (Currently Amended) The system of Claim 14 wherein said modified 
textual representation is used to generate synthesized speech using a TTS system 
distinct from said text-to-speech engine. 



18. (Currently Amended) A system for providing a text-to-speech interface, the 
system comprising: 

a visual interface connected to a text-to-speech engine; and 
at least one communication channel connecting said visual interface to said text- 
to-speech engine, said text-to-speech engine communicating with said visual interface 
over said communication channel by sending and receiving at least one data segment 
in a format, 

wherein said visual interface communicates variations in one or more types of 
speech features associated with segments of said data by varying visual display 
properties of the segments. 

19. (Original) The system of claim 18 wherein said format of said data 
segment is a parameterized aligned sound records format. 

20. (Original) The system of claim 18 wherein said text-to-speech engine 
sends said data segment in the parameterized aligned sound records format to said 
visual interface, said visual interface rendering said data segment in a visual form, said 
visual interface allowing editing of said data segment to produce an edited data 
segment, said visual interface sending said edited data segment to said text-to-speech 
engine. 

21. (Original) The system of claim 18 wherein said visual interface sends data 
to said text-to-speech engine over a first said communication channel and said text-to- 
speech engine sends data to said visual interface over a second said communication 
channel. 

22. (Currently Amended) A method for visual tuning text-to-speech conversion 
process, the method comprising: 

converting an input-text to a processed representation using a text-to-speech 
engine, said processed representation including at least one speech feature of said 
input-text; 

displaying said processed representation on a visual editing interface connected 
to said text-to-speech engine, said speech feature of said processed representation 
being displayed in a corresponding graphical form, thereby communicating variations in 
one or more types of speech features associated with segments of said representation 
by varying visual display properties of the segments; and 

providing an editing function in said visual editing interface to a user for modifying 
said speech feature in said graphical form. 

23. (Original) The method of Claim 22 further comprising: 

generating speech audio equivalent of said processed representation through 
said visual editing interface. 

24. (Original) The method of Claim 22 further comprising: 
saving said processed representation in a data store; and 

loading said processed representation stored in said data store into said visual 
editing interface. 

25. (Currently Amended) The method of Claim 22 further comprising: 
converting said processed representation into a modified textual representation 
of the processed input-text. 

26. (Currently Amended) The method of Claim 25 further comprising: 
converting said textual representation into a processed representation. 

wherein the input text [[said textual representation]] is used to generate said processed 
representation. 

27. (Currently Amended) The method of Claim 25 further comprising: 
storing said modified textual representation in a data store; and 

loading said modified textual representation stored in said data store into said 
visual editing interface. 

28. (Currently Amended) The method of Claim 25 further comprising: 

using said modified textual representation to synthesize speech using a TTS 
system distinct from said text-to-speech engine. 

29. (New) The system of claim 1, wherein said visual editing interface displays 
a modified textual representation of said text-input, and variations in visual display for 
communicating different speech features individually associated with different textual 
segments of the textual representation include a combination of at least two of: (a) 
variations in graphical length of the textual segments; (b) variations in vertical positions 
of the textual segments; (c) variations in horizontal spacing of the textual segments; (d) 
variations in font faces of the textual segments; (e) variations in coloring of the textual 
segments; (f) variations in styles of the textual segments; (g) variations in orientation of 
the textual segments; (h) variations in rotation of the textual segments; or (i) punctuation 
of the textual segments. 